fix: Add support for symbolic links in file compression#349
Conversation
Reviewer's GuideAdds first-class support for symbolic links to the pzip archiver by tracking symlink metadata in FileTask, reading symlink targets during reset, and updating Archiver’s compression and ZIP header logic to store link targets correctly with appropriate sizes, CRC, and Unix attributes. Sequence diagram for Archiver::compress with symlink handlingsequenceDiagram
actor User
participant Archiver
participant FileTask
participant FileSystem
User->>Archiver: compress(task)
Archiver->>FileTask: check isSymlink
alt task isSymlink
Archiver->>FileTask: read symlinkTarget
Archiver->>FileTask: write(targetData, targetSize)
Archiver->>FileTask: set header.crc32 for target
Archiver-->>User: Error()
else regular file
Archiver->>FileSystem: open file(task.path)
FileSystem-->>Archiver: file stream
loop read and compress
Archiver->>FileSystem: read chunk
Archiver->>FileTask: write(compressedChunk)
end
Archiver-->>User: Error()
end
Class diagram for updated FileTask and Archiver symlink supportclassDiagram
class FileTask {
fs_path path
fs_file_status status
uintmax_t fileSize
ZipFileHeader header
bool isSymlink
string symlinkTarget
z_stream* compressor
Error reset(fs_path filePath, fs_path relativeTo)
Error write(uint8_t* buffer, size_t size)
}
class ZipFileHeader {
string name
uint32_t crc32
uint32_t compressedSize
uint32_t uncompressedSize
uint16_t method
uint16_t flags
uint32_t externalAttr
}
class Archiver {
Error compress(FileTask* task)
void populateHeader(FileTask* task)
}
Archiver --> FileTask : uses
FileTask o-- ZipFileHeader : has
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Hey - I've found 2 issues, and left some high level feedback:
- The relative-path computation for symlinks in
resetdoes manualstd::stringprefix checks with'/', which will break on Windows-style\separators; consider normalizing separators or usingfs::relativewhere possible to avoid platform-specific bugs. - Storing the symlink target as
fs::read_symlink(path).string()and writing that verbatim into the ZIP may embed platform-specific path formats; consider normalizing to a consistent ZIP-style path (e.g., forward slashes) or storing it asfs::pathand converting viautils::toZipPathwhen writing. - When setting
externalAttrfor symlinks you currently hardcode0120777permissions; if you care about preserving original permission bits, you may want to derive the mode fromstatus.permissions()and OR it withS_IFLNKinstead of using a fixed value.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The relative-path computation for symlinks in `reset` does manual `std::string` prefix checks with `'/'`, which will break on Windows-style `\` separators; consider normalizing separators or using `fs::relative` where possible to avoid platform-specific bugs.
- Storing the symlink target as `fs::read_symlink(path).string()` and writing that verbatim into the ZIP may embed platform-specific path formats; consider normalizing to a consistent ZIP-style path (e.g., forward slashes) or storing it as `fs::path` and converting via `utils::toZipPath` when writing.
- When setting `externalAttr` for symlinks you currently hardcode `0120777` permissions; if you care about preserving original permission bits, you may want to derive the mode from `status.permissions()` and OR it with `S_IFLNK` instead of using a fixed value.
## Individual Comments
### Comment 1
<location> `3rdparty/pzip/src/file_task.cpp:122-128` </location>
<code_context>
- if (ec) {
- // 如果无法计算相对路径,使用文件名
- header.name = path.filename().string();
+ std::string pathStr = path.string();
+ std::string relToStr = relativeTo.string();
+ if (!relToStr.empty() && relToStr.back() != '/') {
+ relToStr += '/';
+ }
+
+ if (pathStr.find(relToStr) == 0) {
+ // path 以 relativeTo 开头,直接截取
+ header.name = utils::toZipPath(pathStr.substr(relToStr.size()));
</code_context>
<issue_to_address>
**suggestion (bug_risk):** String-prefix check for relative path can misbehave on edge cases and is platform-sensitive.
`pathStr.find(relToStr) == 0` can match unintended prefixes (e.g., `/foo` vs `/foobar`) and breaks when separators differ (`/` vs `\` on Windows). To avoid `fs::relative` while still handling symlinks carefully, consider normalizing both paths (including separators and ensuring a path-boundary after `relToStr`) or using `fs::path` APIs up to the point of constructing the final relative string, so the prefix check is based on canonicalized components rather than raw strings.
Suggested implementation:
```cpp
// 设置相对路径名
// 注意:fs::relative() 会解析符号链接,这里使用路径组件比较避免解析符号链接
if (!relativeTo.empty()) {
// 使用 path 组件而不是字符串前缀,避免 /foo 与 /foobar 等前缀问题,
// 同时不依赖平台分隔符
auto relBegin = relativeTo.begin();
auto relEnd = relativeTo.end();
auto pathBegin = path.begin();
auto pathEnd = path.end();
// relativeTo 的组件数不能多于 path
if (std::distance(relBegin, relEnd) <= std::distance(pathBegin, pathEnd) &&
std::equal(relBegin, relEnd, pathBegin)) {
// 构造相对路径:path 去掉 relativeTo 前缀后的部分
fs::path relPath;
auto it = pathBegin;
std::advance(it, std::distance(relBegin, relEnd));
for (; it != pathEnd; ++it) {
relPath /= *it;
}
// 如果恰好相等,则仍然退回到使用文件名,以避免空名
if (!relPath.empty()) {
header.name = utils::toZipPath(relPath);
} else {
header.name = utils::toZipPath(path.filename());
}
} else {
// 回退到使用文件名
header.name = utils::toZipPath(path.filename());
}
} else {
header.name = utils::toZipPath(path.filename());
```
If this file does not already include `<algorithm>` and `<iterator>`, you should add them at the top of `3rdparty/pzip/src/file_task.cpp` so that `std::equal` and `std::distance` are available:
- `#include <algorithm>`
- `#include <iterator>`
</issue_to_address>
### Comment 2
<location> `3rdparty/pzip/src/file_task.cpp:85` </location>
<code_context>
bufferUsed_ = 0;
written_ = 0;
+ isSymlink = false;
+ symlinkTarget.clear();
</code_context>
<issue_to_address>
**issue (complexity):** Consider replacing the custom relative-path logic with a helper using `lexically_relative` and extracting symlink/file-type handling into a dedicated helper to simplify `reset`.
The manual relative-path handling and the extra branching in `reset` do increase complexity; you can simplify without changing behavior.
### 1. Replace manual string-based relative path with `lexically_relative`
You can avoid mixing `std::filesystem::path` with ad‑hoc string manipulation and avoid `fs::relative()` (which resolves symlinks) by using `lexically_relative` and a small helper:
```cpp
// helper (e.g. in an anonymous namespace or a util header)
fs::path makeZipRelativePath(const fs::path& path, const fs::path& base) {
fs::path rel = path.lexically_relative(base);
// If not under base, fall back to filename (current behavior)
if (rel.empty() || rel.native().rfind("..", 0) == 0 || rel.is_absolute()) {
return path.filename();
}
return rel;
}
```
Usage in `reset`:
```cpp
// 设置相对路径名
if (!relativeTo.empty()) {
fs::path relPath = makeZipRelativePath(path, relativeTo);
header.name = utils::toZipPath(relPath);
} else {
header.name = utils::toZipPath(path.filename());
}
```
This keeps symlinks un-resolved (purely lexical arithmetic), removes assumptions about `/`, and avoids repeated `string()` calls.
### 2. Extract symlink handling out of `reset`
`reset` is doing many things now (state reset, status detection, symlink handling, size determination, relative path). You can factor out the symlink/file-type logic into a small helper to reduce branching and make `reset` easier to scan:
```cpp
// helper
Error initFileTypeAndSize(const fs::path& path,
fs::file_status& status,
bool& isSymlink,
std::string& symlinkTarget,
uint64_t& fileSize) {
std::error_code ec;
status = fs::symlink_status(path, ec);
if (ec) {
return Error(ErrorCode::FILE_NOT_FOUND, "Cannot stat file: " + path.string());
}
isSymlink = fs::is_symlink(status);
if (isSymlink) {
symlinkTarget = fs::read_symlink(path, ec).string();
if (ec) {
return Error(ErrorCode::FILE_READ_ERROR, "Cannot read symlink target: " + path.string());
}
fileSize = symlinkTarget.size();
} else if (fs::is_regular_file(status)) {
fileSize = fs::file_size(path, ec);
if (ec) {
return Error(ErrorCode::FILE_READ_ERROR, "Cannot get file size: " + path.string());
}
} else {
fileSize = 0;
}
return Error::success();
}
```
Then in `reset`:
```cpp
isSymlink = false;
symlinkTarget.clear();
path = filePath;
Error err = initFileTypeAndSize(path, status, isSymlink, symlinkTarget, fileSize);
if (!err.ok()) {
return err;
}
```
This preserves all new functionality (symlink size and target handling) but reduces complexity and concentrates path and symlink logic in focused helpers.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
- Introduced isSymlink and symlinkTarget attributes in FileTask to handle symbolic links. - Updated Archiver to compress symlink targets and adjust ZIP header properties accordingly. - Modified FileTask reset method to detect and read symlink targets, ensuring proper file size handling. Log: Enhance file compression capabilities by supporting symbolic links
e152892 to
ad1dcfc
Compare
deepin pr auto review这段代码主要实现了在 ZIP 压缩工具中添加对符号链接(Symlink)的支持。整体逻辑是正确的,能够识别符号链接、读取其目标路径,并在 ZIP 文件中正确设置 Unix 文件属性以表示这是一个符号链接。 以下是从语法逻辑、代码质量、代码性能和代码安全四个方面的详细审查意见: 1. 语法逻辑
2. 代码质量
3. 代码性能
4. 代码安全
改进建议总结
修改后的代码片段建议 (针对相对路径计算的健壮性)由于手动计算相对路径比较复杂且容易出错(特别是处理 但考虑到 C++17 // 在 file_task.cpp 中
if (!relativeTo.empty()) {
// 检查是否有共同的根目录(例如 Windows 下的 C: 和 D: 没有共同根)
if (path.root_path() != relativeTo.root_path()) {
// 如果根目录不同,无法生成相对路径,使用绝对路径或文件名
// 这里选择使用文件名作为回退,或者根据需求使用完整路径
header.name = utils::toZipPath(path.filename());
} else {
// 原有的手动迭代逻辑
auto pathIt = path.begin();
auto relIt = relativeTo.begin();
// 跳过共同前缀
while (pathIt != path.end() && relIt != relativeTo.end() && *pathIt == *relIt) {
++pathIt;
++relIt;
}
// 处理 relativeTo 剩余的部分(需要添加 ../)
// 注意:原代码缺少这部分,如果 relativeTo 比 path 深,生成的路径就是错的
fs::path relPath;
for (; relIt != relativeTo.end(); ++relIt) {
relPath /= "..";
}
// 添加 path 剩余的部分
for (; pathIt != path.end(); ++pathIt) {
relPath /= *pathIt;
}
if (!relPath.empty()) {
header.name = utils::toZipPath(relPath);
} else {
// 理论上如果完全相同,这里应该是 ".",但 ZIP 通常不需要 "."
// 或者指向文件本身
header.name = utils::toZipPath(path.filename());
}
}
}注意:上述修改增加了对 总结代码整体实现了功能,但在相对路径计算的健壮性上存在改进空间,特别是在处理非子目录路径时。同时,作为生成包含符号链接 ZIP 的工具,应明确安全责任边界。 |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: dengzhongyuan365-dev, lzwind The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
/forecemerge |
|
/forcemerge |
Log: Enhance file compression capabilities by supporting symbolic links
Summary by Sourcery
Add support for correctly archiving symbolic links by detecting them during file task setup and treating their targets as stored data in ZIP entries.
New Features:
Enhancements: